Deep encoding of etymological information in TEI

Authors

  • Jack Bowers
  • Laurent Romary
Abstract

In this paper we provide a systematic and comprehensive set of modeling principles for representing etymological data in digital dictionaries using TEI. The purpose is to integrate in one coherent framework both digital representations of legacy dictionaries and born-digital lexical databases that are constructed manually or semi-automatically. We provide examples from many different types of etymological phenomena from traditional lexicographic practice, as well as analytical approaches from functional and cognitive linguistics such as metaphor, metonymy and grammaticalization, which in many lexicographical and formal linguistic circles have not often been treated as truly etymological in nature, and have thus been largely left out of etymological dictionaries. In order to fully and accurately express the phenomena and their structures, we have made several proposals for expanding and amending some aspects of the existing TEI framework. Finally, with reference to both synchronic and diachronic data, we also demonstrate how encoders may integrate semantic web/linked open data information resources into TEI dictionaries as a basis for the sense, and/or the semantic domain of an entry and/or an etymon.
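As a rough illustration of what an etymological encoding in a TEI dictionary entry can look like, the following minimal sketch builds an entry with a single etymon. It uses only standard TEI dictionary elements (`entry`, `form`, `orth`, `etym`, `lang`, `mentioned`, `gloss`). The helper name `build_entry`, the `type="borrowing"` value and the sample word are assumptions made for illustration; they are not taken from the paper and do not reproduce its proposed extensions.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def tei(name):
    """Return the namespace-qualified tag for a TEI element name."""
    return f"{{{TEI_NS}}}{name}"


def build_entry(lemma, lang, etymon, etymon_lang, gloss, etym_type):
    """Build a TEI <entry> with one <etym> describing a single etymon.

    Hypothetical helper: the attribute values passed in (for example
    etym_type="borrowing") are illustrative, not prescribed by the paper.
    """
    ET.register_namespace("", TEI_NS)  # serialize TEI as the default namespace
    entry = ET.Element(tei("entry"))
    entry.set(f"{{{XML_NS}}}lang", lang)

    # Headword of the entry.
    form = ET.SubElement(entry, tei("form"), {"type": "lemma"})
    ET.SubElement(form, tei("orth")).text = lemma

    # Etymology: source language, source form, and its meaning.
    etym = ET.SubElement(entry, tei("etym"), {"type": etym_type})
    ET.SubElement(etym, tei("lang")).text = etymon_lang
    ET.SubElement(etym, tei("mentioned")).text = etymon
    ET.SubElement(etym, tei("gloss")).text = gloss
    return entry


if __name__ == "__main__":
    # Sample data chosen for illustration only.
    entry = build_entry("beef", "en", "boef", "Old French", "ox", "borrowing")
    print(ET.tostring(entry, encoding="unicode"))
```

Keeping the source language, the source form and the gloss in separate child elements is what makes such an entry queryable, for example to list every borrowing from a given language.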

Similar resources

Markup of Korean Dictionary Entries

Dictionary markup (encoding) is one of the concerns of the TEI (Text Encoding Initiative), an international project for text encoding. In this paper, we investigate ways to use and extend the TEI encoding scheme for the markup of Korean dictionary entries. Since the TEI suggestions for dictionary markup are mainly for Western-language dictionaries, we need to cope with problems to be encountered in encodin...

Representing TEI Documents in the CLASSIC Knowledge Representation System

The development of the Text Encoding Initiative (TEI) Guidelines enables the encoding of a wide variety of textual phenomena to any desired level of fine-grainedness and complexity, relevant to a broad range of applications and scholarly interests. The ability to encode complex phenomena has, in turn, created a demand for adequate means to manipulate the text once it has been marked up according...

Encoding Biomedical Resources in TEI: The Case of the GENIA Corpus

It is well known that standardising the annotation of language resources significantly raises their potential, as it enables re-use and spurs the development of common technologies. Despite the fact that increasingly complex linguistic information is being added to biomedical texts, no standard solutions have so far been proposed for their encoding. This paper describes a standardised XML tagse...

Teaching TEI: The Need for TEI by Example

The Text Encoding Initiative (TEI) has provided a complex and comprehensive system of provisions for scholarly text encoding. Although a major focus of the ‘digital humanities’ domain, and despite much teaching effort by the TEI community, there is a lack of teaching materials available, which would encourage the adoption of the TEI’s recommendations and the widespread use of its text encoding ...

Using ODD for Multi-purpose TEI Documentation

The philosophy of "literate programming" (Knuth 1984), on which the TEI ODD is founded, proposes that code and documentation be written and maintained as a single integrated resource, from which both working programs and readable documentation can be generated. As currently designed, the TEI ODD system supports these goals, and there exist several good examples of extended project documentation...

Journal title:
  • CoRR

Volume abs/1611.10122

Publication date 2016